Normalization of Gender, Dialect and Speaking style using Probabilistic front-ends
Abstract
This paper analyzes the capability of probabilistic Multilayer Perceptron (MLP) front-ends to perform various normalizations for robust Automatic Speech Recognition (ASR). We find decision trees to be a useful tool for investigating the normalization of the feature space achieved by various front-ends. We add questions about different environmental conditions to the training of the phonetic context decision tree, and count the number of splits dedicated to lexical discrimination using context and the number dedicated to these environmental conditions. We compare (1) BottleNeck (BN) features and (2) standard stacked Mel Frequency Cepstral Coefficients (MFCC) with Linear Discriminant Analysis (LDA). In previous work, we found that the BN front-end leads to fewer gender questions than MFCC, which may be part of the reason why BN front-ends can achieve significant improvements. In this work, we extend this approach to the analysis of dialect on a large database of Pan-Arabic speech.
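The split-counting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' system: real phonetic decision trees cluster context-dependent model states with a likelihood criterion, whereas this toy version grows a greedy tree over scalar feature values and uses reduction in summed squared error as a stand-in. The names `sse`, `count_splits`, `toy_samples`, and `QUESTIONS`, as well as the data, are hypothetical.

```python
from collections import Counter


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def count_splits(samples, questions, min_gain=1e-6):
    """Grow a greedy tree and count the chosen questions by category.

    samples   : list of (attribute dict, scalar feature value)
    questions : list of (category, attribute name, attribute value)
    The squared-error criterion is an assumption of this sketch, not the
    likelihood-based criterion used in real ASR state clustering.
    """
    counts = Counter()
    stack = [samples]
    while stack:
        node = stack.pop()
        parent = sse([v for _, v in node])
        best = None
        for category, name, value in questions:
            yes = [s for s in node if s[0][name] == value]
            no = [s for s in node if s[0][name] != value]
            if not yes or not no:
                continue  # this question does not divide the node
            gain = parent - sse([v for _, v in yes]) - sse([v for _, v in no])
            if gain > min_gain and (best is None or gain > best[0]):
                best = (gain, category, yes, no)
        if best is not None:
            counts[best[1]] += 1
            stack.extend([best[2], best[3]])
    return counts


# Hypothetical question set: one phonetic-context and one gender question.
QUESTIONS = [("context", "left", "a"), ("gender", "gender", "f")]


def toy_samples(gender_shift):
    """Four toy samples; gender_shift offsets the female speakers' values."""
    samples = []
    for left, base in (("a", 0.0), ("i", 1.0)):
        for gender, shift in (("f", gender_shift), ("m", 0.0)):
            samples.append(({"left": left, "gender": gender}, base + shift))
    return samples


unnormalized = count_splits(toy_samples(gender_shift=5.0), QUESTIONS)
normalized = count_splits(toy_samples(gender_shift=0.0), QUESTIONS)
print("gender-dependent features:", dict(unnormalized))
print("gender-normalized features:", dict(normalized))
```

On this toy data, the gender-shifted features spend one split on the gender question, while the features without a gender offset spend none, which mirrors the kind of comparison the abstract describes between front-ends.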
Similar resources
Percentage of Consonants Correct for 3-5 Years Old Kurdish-Speaking Children With Middle Kurmanji-Mukryani Dialect
Objectives: The present research aims to study the normal development of Percentage of Consonants Correct (PCC), as an Articulation Competency Index (ACI), in Kurdish-speaking children with the Middle Kurmanji-Mukryani dialect. PCC was examined in terms of the manner of articulation and position of sound in the word. Methods: In this descriptive-analytical cross-sectional study, 120 Kurdish-speak...
Feature Level Compensation for Robust Speaker Identification in Mismatched Conditions
In this paper, robust front-end features are proposed to improve speaker identification (SI) performance under real-world factors such as mismatch between training and testing conditions. The most commonly used MFCC features are very sensitive to effects such as channel and environment mismatch. Characteristics of speech change with room acoustics, ch...
Dialect Variation in Speaking Rate
Differences in speaking rate among American English regional dialects have been assumed and have become popular belief in U.S. culture without evidence to prove or disprove them. This study compares the speaking rates of speakers in south-central Wisconsin and western North Carolina to see whether southerners do, in fact, speak more slowly than northerners. Age and gender are also comp...
Perceptual compensation for differences in speaking style
It is well-established that listeners will shift their categorization of a target vowel as a function of acoustic characteristics of a preceding carrier phrase (CP). These results have been interpreted as an example of perceptual normalization for variability resulting from differences in talker anatomy. The present study examined whether listeners would normalize for acoustic variability resul...
Transcribing radio news
We have recently extended the capabilities of BBN's large vocabulary discrete-utterance speech recognition system (BYBLOS) to operate on raw audio recordings of radio news programming. The recordings are given to the system as large monolithic waveforms without any additional side information. Our goal is to transcribe all speech in the input with the highest accuracy possible. The problem is ve...